feat: add executable Tier 2 Agent Teams patterns #195
Conversation
…xsim-batch SKILL.md

Replace prose descriptions of Tier 2 competitive implementation with concrete TeamCreate/SendMessage call syntax in execute.md section 6.3. Add three complete Tier 2 workflow patterns to maxsim-batch SKILL.md: competitive implementation (debate), multi-reviewer code review (cross-checking), and collaborative debugging (adversarial hypothesis testing). Each pattern includes TeamCreate, teammate spawn, SendMessage exchange, verifier resolution, and Tier 1 graceful degradation fallback. Remove the "planned but not yet implemented" disclaimer. Address the PROJECT.md §7.2 audit gap (Parallelism PARTIAL 1-3).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🎉 This PR is included in version 5.15.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
Pull request overview
This PR upgrades Tier 2 Agent Teams guidance from prose to concrete, copy/pasteable workflow patterns so orchestrators can apply TeamCreate/SendMessage-based collaboration (with Tier 1 fallbacks) during execution, reviews, and debugging.
Changes:
- Replaces the Tier 2 “debate” description in `execute.md` with a step-by-step TeamCreate + teammate spawn + SendMessage critique + verifier selection flow (with Tier 1 fallback).
- Adds three complete Tier 2 patterns to `maxsim-batch/SKILL.md` (competitive implementation, multi-reviewer cross-checking, collaborative debugging), including activation checks and per-pattern Tier 1 degradations.
- Removes the “planned but not yet implemented” disclaimer from `SKILL.md`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| templates/workflows/execute.md | Adds concrete Tier 2 competitive debate steps (TeamCreate/SendMessage/verifier) and explicit Tier 1 fallback path. |
| templates/skills/maxsim-batch/SKILL.md | Documents executable Tier 2 Agent Teams patterns + activation check + strengthened graceful-degradation guidance. |
```
SendMessage({
  type: "message",
  recipient: "competitor-b",
  content: "Review competitor-a's implementation. Identify weaknesses, edge cases missed, and potential issues. Be adversarial -- find real problems, not style preferences. Report: (1) correctness issues, (2) missing edge cases, (3) maintainability concerns.",
  summary: "Requesting adversarial review of competitor-a's work"
})

SendMessage({
  type: "message",
  recipient: "competitor-a",
  content: "Review competitor-b's implementation. Identify weaknesses, edge cases missed, and potential issues. Be adversarial -- find real problems, not style preferences. Report: (1) correctness issues, (2) missing edge cases, (3) maintainability concerns.",
  summary: "Requesting adversarial review of competitor-b's work"
})
```

Each teammate responds with a structured critique. This fights LLM anchoring bias -- the first plausible answer does not automatically win.

**Step 2d -- Verifier selects winner:**

Spawn a fresh verifier agent (NOT a team member) to evaluate both implementations and both critiques:

```
Agent(
  subagent_type: "verifier",
  model: "{verifier_model}",
  prompt: "
    You are judging a competitive implementation. Two (or three) agents each implemented the same task independently, then reviewed each other's work adversarially.

    ## Implementations
    - competitor-a (CONSERVATIVE): {summary or path to worktree-a}
    - competitor-b (INNOVATIVE): {summary or path to worktree-b}

    ## Critiques
    - competitor-b's critique of competitor-a: {critique-b-of-a}
    - competitor-a's critique of competitor-b: {critique-a-of-b}

    ## Selection Criteria (in priority order)
    1. Correctness -- does it satisfy all success criteria?
    2. Test coverage -- are edge cases tested?
    3. Code quality -- readability, maintainability, consistency with codebase
    4. Simplicity -- prefer fewer abstractions when correctness is equal

    Output exactly: WINNER: competitor-{a|b|c}
    Followed by a justification paragraph.
  "
)
```
Step 2c says “each reviews the others’ work”, but the example SendMessage exchange only covers competitor-a ↔ competitor-b. If competitor-c is spawned (critical tasks), it would receive no critique request and its feedback won’t be available to the verifier. Expand the debate phase to include competitor-c (e.g., round-robin critiques) or explicitly state that Tier 2 debate is only for 2 competitors.
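The round-robin expansion the comment asks for can be sketched as a generator over ordered (reviewer, author) pairs, so an optional competitor-c is covered automatically. This is an illustrative Python helper, not part of the repo; the payload fields mirror the SendMessage schema used in the examples above.

```python
from itertools import permutations

def round_robin_critiques(competitors):
    """Build one adversarial-review request per ordered (reviewer, author)
    pair, so every competitor critiques every other competitor."""
    requests = []
    for reviewer, author in permutations(competitors, 2):
        requests.append({
            "type": "message",
            "recipient": reviewer,
            "content": (
                f"Review {author}'s implementation. Be adversarial: "
                "(1) correctness issues, (2) missing edge cases, "
                "(3) maintainability concerns. "
                "Find real problems, not style preferences."
            ),
            "summary": f"Requesting adversarial review of {author}",
        })
    return requests

# Two competitors yield 2 critiques; three yield 6 (each reviews both others).
assert len(round_robin_critiques(["competitor-a", "competitor-b"])) == 2
assert len(round_robin_critiques(
    ["competitor-a", "competitor-b", "competitor-c"])) == 6
```

With this shape, the verifier prompt can then be assembled from the full critique list rather than the fixed a↔b pair.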
```
Output exactly: WINNER: competitor-{a|b|c}
Followed by a justification paragraph.
```
The verifier prompt instructs “Output exactly: WINNER: competitor-{a|b|c}” and also asks for a justification paragraph. “Output exactly” is ambiguous/contradictory here and can cause downstream parsers to fail if they expect a single-line output. Clarify that the first line must be exactly the WINNER line, followed by free-form justification.
Suggested change:
```diff
-Output exactly: WINNER: competitor-{a|b|c}
-Followed by a justification paragraph.
+On the first line, output exactly one of: WINNER: competitor-a, WINNER: competitor-b, or WINNER: competitor-c
+After that first line, output a justification paragraph explaining your choice.
```
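If a downstream step parses the verifier output, pinning the contract to "first line is the WINNER line" keeps that parser trivial. A minimal Python sketch, assuming that contract (the `parse_winner` helper is hypothetical, not repo code):

```python
import re

def parse_winner(verifier_output: str) -> str:
    """Return the winner named on the first non-empty line.

    Assumes the first line is exactly 'WINNER: competitor-<x>' and
    everything after it is free-form justification, which is ignored.
    """
    lines = [ln.strip() for ln in verifier_output.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty verifier output")
    match = re.fullmatch(r"WINNER: (competitor-[abc])", lines[0])
    if match is None:
        raise ValueError(f"unparseable first line: {lines[0]!r}")
    return match.group(1)

assert parse_winner("WINNER: competitor-b\nIt covered more edge cases.") == "competitor-b"
```

Any output that buries the winner mid-paragraph fails fast instead of silently picking the wrong branch.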
```
subagent_type: "verifier",
model: "{verifier_model}",
```
This section mixes tool-call argument styles (earlier in the file Agent(...) is shown with subagent_type="executor" and run_in_background=true, but this new snippet uses subagent_type: "verifier", model: "...", commas, etc.). Since this is meant to be executable syntax, please standardize on one argument format within execute.md to avoid copy/paste errors.
Suggested change:
```diff
-  subagent_type: "verifier",
-  model: "{verifier_model}",
+  subagent_type="verifier",
+  model="{verifier_model}",
```
```
TeamCreate(team_name: "probe-{timestamp}", description: "availability check")
# If probe fails, set TIER=1 and log reason
```
The Tier 2 activation “probe TeamCreate (lightweight — create and immediately clean up)” example never shows the cleanup step. As written it will leave probe teams under ~/.claude/teams/ and ~/.claude/tasks/ on every run. Add an explicit TeamDelete step (or a deterministic probe name + delete) so the probe is actually lightweight/idempotent.
Suggested change:
```diff
-TeamCreate(team_name: "probe-{timestamp}", description: "availability check")
-# If probe fails, set TIER=1 and log reason
+probe_name = "probe-tier2-activation"
+TeamCreate(team_name: probe_name, description: "availability check")
+TeamDelete(team_name: probe_name)
+# If probe fails at any step, set TIER=1 and log reason
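The probe-and-degrade control flow can be sketched in Python. The `team_create`/`team_delete` callables stand in for the TeamCreate/TeamDelete tool calls; the helper name and logging are illustrative only.

```python
def probe_tier2(team_create, team_delete, log):
    """Probe Agent Teams availability and return the tier to run at.

    A deterministic probe name keeps the check idempotent across runs:
    a leftover probe team from a crashed run is simply reused/deleted.
    """
    probe_name = "probe-tier2-activation"
    try:
        team_create(team_name=probe_name, description="availability check")
    except Exception as exc:
        # Graceful degradation: fall back to Tier 1 and record why.
        log(f"TIER=1: TeamCreate probe failed ({exc})")
        return 1
    try:
        team_delete(team_name=probe_name)
    except Exception as exc:
        # Teams demonstrably work; just record the leaked probe team.
        log(f"probe cleanup failed, team may linger: {exc}")
    return 2

created, deleted, logs = [], [], []
assert probe_tier2(lambda **kw: created.append(kw),
                   lambda **kw: deleted.append(kw),
                   logs.append) == 2
assert created and deleted
```

The key property is that cleanup failure does not demote the tier: a successful TeamCreate already proves availability.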
```
// (Optional -- critical tasks only) Teammate C -- defensive approach
Spawn teammate "competitor-c" with prompt:
  "Implement {task_description} using approach: DEFENSIVE.
  Maximize error handling, edge case coverage, and robustness.
  Work in isolation until the review phase.
  Phase: {N}, Plan: {id}, Issue: #{phase_issue_number}.
  Success criteria: {criteria from plan}.
  When done, commit your work and report RESULT: PASS or RESULT: FAIL."
Model: {executor_model}
```

**Step 3 -- Adversarial critique via SendMessage:**

After all teammates complete, each reviews the others' implementations:

```
SendMessage({
  type: "message",
  recipient: "competitor-b",
  content: "Review competitor-a's implementation. Be adversarial: (1) correctness issues, (2) missing edge cases, (3) maintainability concerns. Find real problems, not style preferences.",
  summary: "Requesting adversarial review of competitor-a"
})

SendMessage({
  type: "message",
  recipient: "competitor-a",
  content: "Review competitor-b's implementation. Be adversarial: (1) correctness issues, (2) missing edge cases, (3) maintainability concerns. Find real problems, not style preferences.",
  summary: "Requesting adversarial review of competitor-b"
})
```
Pattern 1 allows spawning an optional competitor-c for critical tasks, but the SendMessage critique examples only request reviews between competitor-a and competitor-b. If competitor-c participates, include critique exchanges involving competitor-c (and pass those critiques into the verifier prompt), or state that the debate pattern is strictly 2-way.
```
SendMessage({
  type: "message",
  recipient: "competitor-b",
  content: "Review competitor-a's implementation. Be adversarial: (1) correctness issues, (2) missing edge cases, (3) maintainability concerns. Find real problems, not style preferences.",
  summary: "Requesting adversarial review of competitor-a"
})
```
These Tier 2 patterns use the SendMessage({ type, recipient, content, summary }) schema, but docs/spec/agent-teams-research.md documents a newer v2.1.75+ schema using to/message/summary and calls out a breaking change. To avoid shipping “executable” examples that may be wrong depending on runtime version, please reconcile the repo docs (pick one schema + version guard, or note both with guidance on which to use).
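Until the docs are reconciled, an adapter can bridge the two payload shapes. This is a hedged sketch: both field sets are taken from the review comment above ({type, recipient, content, summary} versus the newer {to, message, summary}), and the mapping must be verified against the actual runtime schema before relying on it.

```python
def to_v2_send_message(msg: dict) -> dict:
    """Map an older-style SendMessage payload onto the newer field names.

    Assumed mapping (verify against docs/spec/agent-teams-research.md):
    recipient -> to, content -> message, summary -> summary; the fixed
    type field is dropped because the newer shape has no equivalent here.
    """
    return {
        "to": msg["recipient"],
        "message": msg["content"],
        "summary": msg.get("summary", ""),
    }

old = {
    "type": "message",
    "recipient": "competitor-b",
    "content": "Review competitor-a's implementation.",
    "summary": "Requesting adversarial review of competitor-a",
}
assert to_v2_send_message(old)["to"] == "competitor-b"
```

A version guard in the docs (run this mapping only on v2.1.75+) would let one canonical set of examples serve both runtimes.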
Summary
- Replaces prose in `execute.md` section 6.3 with concrete `TeamCreate`/`SendMessage` call syntax for the competitive implementation debate pattern
- Adds three complete Tier 2 workflow patterns to `maxsim-batch/SKILL.md`: competitive implementation (debate), multi-reviewer code review (cross-checking), and collaborative debugging (adversarial hypothesis testing)

Addresses PROJECT.md §7.2 audit gap (Parallelism PARTIAL 1-3).
Test plan
- `agent-teams-guide.md` API reference (SendMessage parameters: type, recipient, content, summary)

🤖 Generated with Claude Code